Optimal High-Dimensional Entanglement Concentration for Pure Bipartite Systems

Considering pure quantum states, entanglement concentration is the procedure by which, from N copies of a partially entangled state, a single state with higher entanglement can be obtained. Obtaining a maximally entangled state is possible for N=1. However, the associated success probability can be extremely low when the system's dimensionality increases. In this work, we study two methods to achieve probabilistic entanglement concentration for bipartite quantum systems of large dimensionality for N=1, aiming at a reasonably good probability of success at the expense of a non-maximal entanglement. Firstly, we define an efficiency function Q that considers a tradeoff between the amount of entanglement (quantified by the I-Concurrence) of the final state after the concentration procedure and its success probability, which leads to a quadratic optimization problem. We found an analytical solution, ensuring that an optimal scheme for entanglement concentration can always be found in terms of Q. Finally, a second method was explored, which is based on fixing the success probability and searching for the maximum amount of entanglement attainable. Both methods resemble the Procrustean method applied to a subset of the most significant Schmidt coefficients, but yield non-maximally entangled states.


Introduction
Quantum entanglement is the best-known, most remarkable, and most useful quantum resource in quantum information (QI) theory [1], as it underlies several QI protocols, such as dense coding [2], entanglement swapping [3], quantum teleportation [4], and quantum cryptography [5]. For instance, in the bipartite scenario, two users who want to communicate (usually called Alice and Bob) can share an entangled state [6]. In this case, the ability to transmit information encoded in the state shared by Alice and Bob depends on the amount of entanglement [7,8]. Moreover, the most favorable case for faithful communication is when Alice and Bob share a pure maximally entangled state (MES) [9]. However, even if this was the initial state, the noisy quantum channel used to send the information will produce a loss of correlations in the MES [10]. Moreover, the quantum operations needed to carry out a particular quantum application are performed imperfectly due to experimental errors, yielding fidelities of less than one [11].
In such cases where they have access only to a partially entangled state ρ, it is desirable to access a channel that allows a more faithful way to send quantum information. One solution is to implement protocols to increase the amount of entanglement [12,13]. These protocols are known as entanglement purification or entanglement distillation [14][15][16] and entanglement concentration [17]. These methods are based on the fact that local operations and classical communication between Alice and Bob cannot increase, on average, the amount of entanglement in the initially entangled pairs [18].
In the case of entanglement purification, the goal is to increase the purity and the entanglement of the initial state ρ, at the cost of reducing the number of initial copies available. It can only be implemented successfully in a probabilistic way [14]. Moreover, an experimental realization of entanglement purification was carried out for mixed states of polarization-entangled photons using linear optics [19].
In the entanglement concentration, the process considers the cases where the initial partially entangled state is pure [20,21]. Indeed, there are two ways to implement entanglement concentration: the Procrustean method and the Schmidt projection method [17,20,21]. The Procrustean method is easier to implement than the Schmidt projection method because the initial partially entangled state is known. The entanglement concentration procedure is carried out by local filtering onto individual pairs of the initial state [17]. In the Schmidt method, however, the process of entanglement concentration is implemented in at least two unknown partially entangled states through collective simultaneous measurements onto the particles [22]. Thus, schemes for carrying out the entanglement concentration have been proposed for the Procrustean [23] and the Schmidt method [24,25]. Moreover, its experimental implementation has been achieved in the case of the Procrustean method [26] and for the Schmidt method [22] using partially polarization-entangled photons.
The entanglement concentration can also be classified as deterministic [12,27,28] as well as probabilistic [11,13,20,29]. In the deterministic case, the process has a probability equal to one to be successfully implemented in the regimes of few copies or in the asymptotic limit of infinite copies [30]. In this scheme, the quantum circuits to carry out deterministic entanglement concentration have been proposed [31] and recent experimental efforts demonstrate its feasibility [32][33][34][35]. In these experimental works, the copies are replaced by additional degrees of freedom of the same pair of photons, which improves the possibility of short-term implementations of entanglement distillation for technological purposes. On the other hand, in the probabilistic entanglement concentration, the process is achieved with a probability of less than one and has been experimentally implemented [36]. Moreover, the relation in the asymptotic limit between the entanglement concentration in a deterministic and probabilistic way was studied [30]. They found that these methods are equivalent considering many copies of the initial state: the error probability for the probabilistic method goes to zero quickly with the number of copies. In addition, the entanglement concentration is generally studied considering two entangled quantum states, but has also been studied for the case of tripartite correlated systems [37,38].
In this work, we studied probabilistic entanglement concentration in the bipartite scenario of a pure two-qudit (D-dimensional) state. Considering a large dimensionality ($D \gg 2$), we study two methods to achieve entanglement concentration, aiming at a reasonably good probability of success at the expense of a non-maximal entanglement. First, we consider a tradeoff between the amount of entanglement of the state after the concentration procedure and its success probability, quantified by the payoff function Q. This figure of merit leads to a quadratic optimization problem that can be solved analytically, ensuring that an optimal scheme for entanglement concentration can always be found in terms of Q. Then, a second method was studied, where we fixed the success probability and searched for the maximum amount of entanglement attainable in this case. We found that both methods resemble the Procrustean method applied to a subset of the most significant Schmidt coefficients without the constraint of obtaining a MES. We envisage the usefulness of these methods in entanglement-based quantum communication and also for device-independent protocols where high-dimensional entangled states are required with a certain amount of entanglement, such as randomness certification, expansion and self-testing [39][40][41].

Revisiting Entanglement Concentration
Throughout this work, we will limit ourselves to the case of entanglement concentration from a single copy of a two-qudit non-maximally entangled pure state. This state will be given by
$$|\Phi\rangle_{12} = \sum_{m=1}^{D} a_m |m\rangle_1 |m\rangle_2, \tag{1}$$
where $a_m$ are positive coefficients such that $\sum_m a_m^2 = 1$. The set of states $\{|m\rangle_1 |m\rangle_2\}$ can be regarded as the Schmidt basis for the entangled state $|\Phi\rangle_{12}$, and, therefore, $a_m$ will be the respective Schmidt coefficients. In order to quantify the entanglement conveyed by $|\Phi\rangle_{12}$, the I-Concurrence [42] can be used, which is given by
$$C(|\Phi\rangle_{12}) = \sqrt{\frac{D}{D-1}\left(1 - \operatorname{tr}\rho_1^2\right)}, \tag{2}$$
where $\rho_1$ is the reduced density matrix of one of the qudits. This function fulfills the necessary conditions an entanglement measure needs to satisfy [43]. Its minimum value is zero, and its maximum is one, which arise when $|\Phi\rangle_{12}$ is a product state and a maximally entangled state, respectively. This document will refer to C simply as entanglement.
Another function widely used to assess entanglement is the Schmidt number [44][45][46][47][48][49][50][51], defined as
$$K(|\Phi\rangle_{12}) = \frac{1}{\operatorname{tr}\rho_1^2}. \tag{3}$$
It is straightforward to see that $C(|\Phi\rangle_{12})$ and $K(|\Phi\rangle_{12})$ are closely related, as both depend on $\operatorname{tr}\rho_1^2$. As we mentioned above, it is well known that the correlated state given in Equation (1) can have its entanglement increased through an entanglement concentration procedure [7,13,30,52,53]. This process is, in general, a probabilistic one [54]. We will follow the next approach to show the concentration scheme. Assuming we have an ancillary qubit initially prepared in state $|0\rangle_a$, it can be used for concentration through a unitary bipartite operation $U_{a1}$ acting on the ancilla and one of the qudits. Let
$$U_{a1} |0\rangle_a |\Phi\rangle_{12} = |0\rangle_a A_S |\Phi\rangle_{12} + |1\rangle_a A_F |\Phi\rangle_{12}, \tag{4}$$
where $|\mu\rangle_a$ is the state of the ancilla which flags whether concentration was accomplished ($\mu = 0$) or not ($\mu = 1$). $A_S$ and $A_F$ are Kraus operators acting on qudit 1, modifying the entangled state in each case. A measurement on the ancilla announces whether we succeeded. Throughout this work, we will be concerned only with the successful cases, whose study can be simplified by considering $A_S |\Phi\rangle_{12}$ only. Without loss of generality, we may write
$$A_S |\Phi\rangle_{12} = \sqrt{p_S}\, |\Psi\rangle_{12}, \tag{5}$$
where $p_S$ is the probability of success for the concentration procedure, and $|\Psi\rangle_{12}$ is the resulting state, for which $C(|\Psi\rangle_{12}) > C(|\Phi\rangle_{12})$. If the intention is to obtain a MES, it is known that $p_S = D a_{\min}^2$, where $a_{\min}^2 = \min\{|a_m|^2\}$ [7,53]. This probability, however, may adopt very small values if the Schmidt coefficients exhibit large differences among them, rendering the procedure inefficient.
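As a numerical illustration of the quantities introduced so far, the following minimal sketch evaluates the I-Concurrence, the Schmidt number, and the probability of reaching a MES directly from the squared Schmidt coefficients. The function names and the sample coefficients are hypothetical, and the normalization factor D/(D−1) of the I-Concurrence is an assumption chosen so that its maximum equals one, as stated in the text.

```python
import math

def purity(a_sq):
    """tr(rho_1^2) for squared Schmidt coefficients a_sq (assumed to sum to 1)."""
    return sum(v * v for v in a_sq)

def i_concurrence(a_sq):
    """I-Concurrence, normalized (assumption) so that a MES gives 1."""
    D = len(a_sq)
    return math.sqrt(max(0.0, D / (D - 1) * (1.0 - purity(a_sq))))

def schmidt_number(a_sq):
    """Schmidt number K = 1 / tr(rho_1^2)."""
    return 1.0 / purity(a_sq)

def mes_success_probability(a_sq):
    """Probability p_S = D * a_min^2 of concentrating to a D-dimensional MES."""
    return len(a_sq) * min(a_sq)

# Illustrative (made-up) squared Schmidt coefficients for D = 4.
a_sq = [0.70, 0.20, 0.09, 0.01]
print("C     =", i_concurrence(a_sq))
print("K     =", schmidt_number(a_sq))
print("p_MES =", mes_success_probability(a_sq))
```

For strongly unbalanced coefficients such as these, the probability of obtaining a MES is governed entirely by the smallest coefficient, which is the inefficiency discussed above.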
Alternatively, one may increase the success probability at the expense of having a partially entangled state as a result. In Ref. [20], Vidal studied the case of transforming Schmidt coefficients $\{a_m\}$ onto a given set $\{b_m\}$ and showed the optimal probability of success for such a map. In this way, one may choose the $b_m$ coefficients in such a way that the success probability is good enough while the entanglement is increased. Another possibility is to set the resulting state $|\Psi\rangle_{12}$ as a maximally entangled one for a subspace of dimension $N \le D$, which is analogous to a Procrustean method (i.e., cutting off extra probabilities from a given reference value [14]) applied only on a subset of the original Schmidt coefficients [13]. Both approaches, however, force one to constrain the final state to be a given one. Thus, the problem contains D arbitrary parameters $b_m$, and one has to search thoroughly for a convenient combination of the $b_m$.
A possible way to decrease the number of free parameters is to use the Kraus operator $A_S(\xi)$ given in Ref. [55]. This approach allows one to interpolate between the initial Schmidt coefficients ($a_m$) and the ones from a maximally entangled state ($1/\sqrt{D}$) using a single parameter ξ. Thus, we may transform $a_m \to b_m(\xi)$, where $0 \le \xi \le 1$, and
$$b_m(\xi) = \sqrt{(1-\xi)\, a_m^2 + \frac{\xi}{D}}. \tag{6}$$
It can be seen that Equation (6) shows a transformation that preserves the norm of the new state and represents a linear interpolation for the squares of the Schmidt coefficients.
Moreover, the success probability is $p(\xi) = \left[1 - \xi + \xi/(D a_{\min}^2)\right]^{-1}$ [55]. This method, although straightforward to understand, leads to little improvement in terms of success probabilities. For instance, Figure 1 shows that even a little improvement in any of the functions used to assess entanglement is achieved at the expense of a substantial drop in the success probability. This figure also shows that the I-Concurrence, although simple to work with because it is not a rational function, is not good for graphical assessment since even the initial I-Concurrence (see ξ = 0) exhibits values close to one. Instead, the Schmidt number is not simple to work with due to its inverse dependence on $\operatorname{tr}\rho_1^2$ but makes graphical evaluation uncomplicated. These previous attempts lead us to question whether a method can obtain a reasonable increment in entanglement with a non-negligible success probability without imposing constraints on the final state beforehand. The next sections will address this question.
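The single-parameter interpolation can be explored numerically with a short sketch such as the one below. It assumes the squared coefficients interpolate linearly between $a_m^2$ and 1/D and uses the success probability quoted from Ref. [55]; the function names and the sample coefficients are hypothetical.

```python
def interpolated_coefficients(a_sq, xi):
    """Squared coefficients b_m^2 = (1 - xi) * a_m^2 + xi / D (linear interpolation)."""
    D = len(a_sq)
    return [(1.0 - xi) * v + xi / D for v in a_sq]

def interpolation_success_probability(a_sq, xi):
    """Success probability p(xi) = 1 / (1 - xi + xi / (D * a_min^2))."""
    D = len(a_sq)
    return 1.0 / (1.0 - xi + xi / (D * min(a_sq)))

# Illustrative (made-up) squared Schmidt coefficients for D = 4.
a_sq = [0.70, 0.20, 0.09, 0.01]
for xi in (0.0, 0.1, 0.5, 1.0):
    b_sq = interpolated_coefficients(a_sq, xi)
    K = 1.0 / sum(v * v for v in b_sq)          # Schmidt number after interpolation
    p = interpolation_success_probability(a_sq, xi)
    print(f"xi={xi:.1f}  K={K:.3f}  p={p:.3f}")
```

Running the loop for a given set of coefficients makes the trade-off visible: the Schmidt number grows with ξ while the success probability decays towards $D a_{\min}^2$.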

Towards Efficient Entanglement Concentration
Here, we shall propose and analyze a more efficient method for entanglement concentration from a single copy of a partially entangled pure state. Let us define a parameterized Kraus operator $A_S(\vec{z})$ applied on one of the qudits. This operator can be written as
$$A_S(\vec{z}) = \sum_{m=1}^{D} z_m |m\rangle\langle m|, \tag{7}$$
so its action on the two-qudit system after successful concentration will be
$$A_S(\vec{z}) |\Phi\rangle_{12} = \sum_{m} a_m z_m |m\rangle_1 |m\rangle_2. \tag{8}$$
Thus, keeping Equation (5) in mind, the post-concentration state and its probability of success are
$$|\Psi(\vec{z})\rangle_{12} = \frac{1}{\sqrt{p_S(\vec{z})}} \sum_{m} a_m z_m |m\rangle_1 |m\rangle_2, \tag{9}$$
$$p_S(\vec{z}) = \sum_{m} a_m^2 |z_m|^2, \tag{10}$$
respectively. Since $p_S(\vec{z})$ must not exceed one, it is mandatory to impose $|z_m| \le 1$. The reduced density matrix for one of the subsystems shall be
$$\rho_1(\vec{z}) = \frac{1}{p_S(\vec{z})} \sum_{m} a_m^2 |z_m|^2 |m\rangle\langle m|. \tag{11}$$
The I-Concurrence and the Schmidt number, as functions of $\vec{z}$, will be given by
$$C(\vec{z}) = \sqrt{\frac{D}{D-1}\left(1 - \frac{\sum_m a_m^4 |z_m|^4}{p_S^2(\vec{z})}\right)}, \tag{12}$$
$$K(\vec{z}) = \frac{p_S^2(\vec{z})}{\sum_m a_m^4 |z_m|^4}. \tag{13}$$
Let us now define a quantity $Q(\vec{z})$ aimed at assessing the efficiency of the concentration procedure considering a trade-off between the probability of success and the increment in entanglement. A Kraus operator that maximizes this efficiency will be pursued. A choice, although not unique at all, might be $p_S(\vec{z}) C(\vec{z})$. Maximizing it will be equivalent to maximizing its square, $[p_S(\vec{z}) C(\vec{z})]^2$, which should be a simpler procedure since the square root we can see in Equation (12) will not be present. However, $[p_S(\vec{z}) C(\vec{z})]^2$ has its maximum when $z_m = 1$, ∀ m, which means the state $|\Phi\rangle_{12}$ will be kept unaltered (this will be proven in Appendix A). Instead, we may try the difference between $C^2(\vec{z})$ and a constant reference level for the I-Concurrence ($C_{\mathrm{REF}}^2$). This reference level could be, for instance, the initial value $C_{\mathrm{INIT}} = C(|\Phi\rangle_{12})$. Let us try by defining an efficiency function such as
$$Q(\vec{z}) = p_S^2(\vec{z}) \left[C^2(\vec{z}) - C_{\mathrm{REF}}^2\right]. \tag{14}$$
Equations (10) and (12) allow us to transform Equation (14) into
$$Q(\vec{z}) = \frac{D}{D-1}\left[P_{\mathrm{REF}} \left(\sum_m a_m^2 |z_m|^2\right)^2 - \sum_m a_m^4 |z_m|^4\right], \tag{15}$$
$$P_{\mathrm{REF}} = 1 - \frac{D-1}{D}\, C_{\mathrm{REF}}^2, \tag{16}$$
where $P_{\mathrm{REF}}$ has been defined for mathematical convenience. It ranges from 1/D to 1, and it can be interpreted as a reference value for the purity of a reduced density matrix, as can be seen from Equation (2). Another interpretation, as can be seen from Equation (3), is that $1/P_{\mathrm{REF}}$ plays the role of a reference Schmidt number.
Equation (15) leads us to infer that the problem of efficient entanglement concentration, in the form it has been described in this document, can be rewritten as a quadratic optimization problem given by
$$\max_{\vec{y}}\ Q(\vec{y}) = \vec{y}^{\,T} M\, \vec{y}, \tag{17a}$$
subject to
$$0 \le y_m \le 1, \tag{17b}$$
where
$$y_m = |z_m|^2, \tag{17c}$$
$$M_{mn} = \frac{D}{D-1}\left(P_{\mathrm{REF}} - \delta_{mn}\right) a_m^2 a_n^2. \tag{17d}$$
Therefore, the problem of efficient entanglement concentration for a single pair of entangled qudits can be written as the quadratic optimization problem described in Equations (17a)-(17d), with the optimization variables $y_m$ lying in a unit hypercube. Finally, without loss of generality, we may choose the positive root $z_m = \sqrt{y_m}$.
Note that the presence of $C_{\mathrm{REF}}$ forces the optimization to look for a solution $\vec{y}_{\mathrm{OPT}}$ such that $C(\vec{y}_{\mathrm{OPT}}) \ge C_{\mathrm{REF}}$. Otherwise, the function $Q(\vec{y}_{\mathrm{OPT}})$ would adopt a negative value [see Equation (14)] and, therefore, it would not represent a maximum. For this reason, we can assure that $C_{\mathrm{REF}} \ge C_{\mathrm{INIT}}$ forces entanglement concentration. In an extreme case, $C_{\mathrm{REF}} = 1$ means that the reference level is equal to the maximum possible value the I-Concurrence can adopt. Therefore, $Q(\vec{y})$ will adopt a negative value unless the final entanglement is also equal to one, for which Q = 0. This is the standard entanglement concentration procedure. On the other hand, $C_{\mathrm{REF}}$ could be slightly smaller than $C_{\mathrm{INIT}}$ and, still, entanglement concentration may occur, as will be shown in Section 4.1. For this problem, the square of the I-Concurrence has been used also because both numerical and analytical solutions are accessible. For graphical purposes, as was already seen in Figure 1, the Schmidt number shall be used. Moreover, the Schmidt number provides an estimation of the number of relevant Schmidt modes involved [45]. We must add that the Kraus operator defined in Equation (7) is diagonal in the Schmidt basis. We could have started from a general Kraus operator instead of a diagonal one. However, Appendix B shows it suffices to look for diagonal operators.
Figure 2 shows the results of the numerical solution of the aforementioned optimization problem for a given set of D = 16 Schmidt coefficients $a_m^2$, randomly chosen and sorted decreasingly in order to ease observation. For this example, we tested four possible values of $C_{\mathrm{REF}}^2$, given by (i) $C_{\mathrm{INIT}}^2/2$, much smaller than the initial entanglement; (ii) $0.98\, C_{\mathrm{INIT}}^2$, slightly smaller than the initial entanglement; (iii) the average value between $C_{\mathrm{INIT}}$ and 1, a significant increase in entanglement; and (iv) $C_{\mathrm{REF}}^2 = 1$, the maximum possible value for $C_{\mathrm{REF}}^2$. The optimization was performed using the function QUADPROG of Matlab R2022b. Since this is a non-convex problem with constant bounds only, the algorithm "trust-region-reflective" was used since it was the best suited for our optimization problem [56].
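The efficiency function of this section can be evaluated in two equivalent ways: directly from its definition as $p_S^2 (C^2 - C_{\mathrm{REF}}^2)$, or as a quadratic form in $y_m = |z_m|^2$. The sketch below compares both. It assumes the I-Concurrence is normalized with the factor D/(D−1) and that $P_{\mathrm{REF}} = 1 - (D-1) C_{\mathrm{REF}}^2 / D$; all function names and sample values are hypothetical.

```python
def success_probability(a_sq, y):
    """p_S = sum_m a_m^2 y_m, with y_m = |z_m|^2."""
    return sum(a * v for a, v in zip(a_sq, y))

def efficiency_from_definition(a_sq, y, c_ref_sq):
    """Q = p_S^2 * (C^2 - C_REF^2), evaluated from the post-concentration state."""
    D = len(a_sq)
    p = success_probability(a_sq, y)
    post_purity = sum((a * v) ** 2 for a, v in zip(a_sq, y)) / p ** 2
    c_sq = D / (D - 1) * (1.0 - post_purity)
    return p ** 2 * (c_sq - c_ref_sq)

def efficiency_quadratic(a_sq, y, c_ref_sq):
    """Same Q written as a quadratic form in y (assumed definition of P_REF)."""
    D = len(a_sq)
    p_ref = 1.0 - (D - 1) / D * c_ref_sq
    p = success_probability(a_sq, y)
    quartic = sum((a * v) ** 2 for a, v in zip(a_sq, y))
    return D / (D - 1) * (p_ref * p ** 2 - quartic)

# Illustrative (made-up) coefficients, filter strengths, and reference level.
a_sq = [0.4, 0.3, 0.2, 0.1]
y = [0.5, 0.7, 1.0, 0.9]
print(efficiency_from_definition(a_sq, y, 0.95))
print(efficiency_quadratic(a_sq, y, 0.95))
```

The quadratic form is the more convenient one for optimization, since it avoids the division by $p_S^2$ and exposes the bound constraints $0 \le y_m \le 1$ directly.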

Numerical Hints
The plots show the original Schmidt coefficients (cyan) and the non-normalized coefficients after concentration (dark red). A pattern is evident. For small values of $C_{\mathrm{REF}}^2$, keeping the state as it is seems to be the best option in terms of efficiency. As $C_{\mathrm{REF}}^2$ increases, the solutions of the optimization problem suggest using a Procrustean method on the n largest Schmidt coefficients, where n increases as $C_{\mathrm{REF}}^2$ moves closer to one. This is analogous to entanglement concentration on a subspace of the bipartite Hilbert space as the one proposed in Ref. [13], although we have not required the final state to be fixed to a given one. Finally, $C_{\mathrm{REF}}^2 = 1$ represents the ideal entanglement concentration context, in which the resulting state exhibits the maximal entanglement possible. The optimization problem shows the correct result, which consists of equalizing all post-concentration Schmidt coefficients. Although Figure 2 shows a single set of initial Schmidt coefficients, the same pattern is observed for other states in any dimension D > 2. In the following, we shall prove why the Procrustean method on a subspace is the most efficient method, according to our figures of merit.

Analytical Results
One of the goals of this work is to find the analytical solution of the optimization problem of Equation (17). The details of the proof will be shown in the next subsections. The procedure can be summarized as follows:
1. If $P_{\mathrm{REF}} = 1/D$ (minimum attainable value, equivalent to $C_{\mathrm{REF}} = 1$), it means we are pursuing a standard entanglement concentration using all Schmidt coefficients. Then, perform concentration using $z_m = a_{\min}/a_m$. Otherwise, follow Steps 2-8.
2. Sort the Schmidt coefficients in decreasing order. Let us label these sorted coefficients as $a_m$.
3. For n = 1, ..., D, compute $\beta_n = \sum_{m=n+1}^{D} a_m^2$.
4. For every n, compute $\alpha_n = P_{\mathrm{REF}}\, \beta_n / (1 - n P_{\mathrm{REF}})$.
5. Find the largest value of n that allows both $\alpha_n \le a_n^2$ and $n < 1/P_{\mathrm{REF}}$ to be simultaneously satisfied. Let us label this value as $n_{\mathrm{OPT}}$.
6. Define $x_m = \alpha_{n_{\mathrm{OPT}}}$ for $m \le n_{\mathrm{OPT}}$, and $x_m = a_m^2$ for $m > n_{\mathrm{OPT}}$.
7. Define $y_m = x_m / a_m^2$. Afterwards, sort the $y_m$ using the inverse of the sorting operation described in Step 2. These sorted values will be the $y_m$ that solve the optimization problem of Equation (17).
8. Define $z_m = \sqrt{y_m}$. These values are the ones needed to construct the Kraus operator of Equation (7).
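The procedure summarized above can be sketched in a few lines of code. The version below is an illustration rather than the authors' implementation: it assumes that the cropping level for the n largest coefficients is $\alpha_n = P_{\mathrm{REF}} \beta_n / (1 - n P_{\mathrm{REF}})$, with $\beta_n$ the sum of the remaining squared coefficients, as follows from the critical-point analysis of the next subsections. The function names, the tolerance `tol`, and the sample coefficients are hypothetical.

```python
import math

def efficiency(x, p_ref):
    """Q up to the constant D/(D-1): P_REF * (sum x)^2 - sum x^2, with x_m = a_m^2 y_m."""
    return p_ref * sum(x) ** 2 - sum(v * v for v in x)

def optimal_concentration(a_sq, p_ref, tol=1e-12):
    """Return (y, z) for squared Schmidt coefficients a_sq (all > 0) and reference P_REF."""
    D = len(a_sq)
    if abs(p_ref - 1.0 / D) <= tol:
        # Standard concentration: every coefficient is cropped to a_min^2.
        a_min_sq = min(a_sq)
        y = [a_min_sq / v for v in a_sq]
        return y, [math.sqrt(v) for v in y]
    # Sort in decreasing order and remember the permutation.
    order = sorted(range(D), key=lambda m: a_sq[m], reverse=True)
    s = [a_sq[m] for m in order]
    n_opt, alpha_opt = 0, 0.0
    beta = sum(s)
    for n in range(1, D + 1):
        beta = max(beta - s[n - 1], 0.0)   # beta_n: sum of the D - n smallest values
        if n * p_ref >= 1.0:               # bound n < 1 / P_REF
            break
        alpha = p_ref * beta / (1.0 - n * p_ref)
        if alpha <= s[n - 1] + tol:        # cropping level must not exceed a_n^2
            n_opt, alpha_opt = n, alpha
    # Crop the n_opt largest coefficients, keep the rest, and undo the sorting.
    x_sorted = [alpha_opt] * n_opt + s[n_opt:]
    y = [0.0] * D
    for pos, m in enumerate(order):
        y[m] = min(1.0, x_sorted[pos] / a_sq[m])
    return y, [math.sqrt(v) for v in y]

# Illustrative (made-up) coefficients for D = 4 and a reference purity of 0.3.
a_sq = [0.4, 0.3, 0.2, 0.1]
y, z = optimal_concentration(a_sq, 0.3)
x = [a * v for a, v in zip(a_sq, y)]
print("x =", x, " p_S =", sum(x), " K =", sum(x) ** 2 / sum(v * v for v in x))
```

In this example the two largest coefficients are cropped to a common value while the two smallest are left untouched, which is the Procrustean pattern on a subspace. If no value of n satisfies both conditions, the sketch leaves the state unchanged.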

Redefining the Optimization Problem
In order to prove the solution detailed above, we shall define $x_m = a_m^2 y_m = a_m^2 |z_m|^2$. This allows us to write the optimization problem [Equation (17)], up to a proportionality constant, in a simpler way:
$$\max_{\vec{x}}\ Q(\vec{x}) = P_{\mathrm{REF}} \left(\sum_m x_m\right)^2 - \sum_m x_m^2, \qquad 0 \le x_m \le a_m^2. \tag{18}$$
These new variables $x_m$ are the ones plotted in Figure 2 using dark red bars. Thus, the $x_m$ will provide an idea about the post-concentration Schmidt coefficients. The domain is no longer the unit hypercube, but an orthotope whose vertices have coordinate components equal to either zero or $a_m^2$. Thus, every $x_m$ has three options: (i) having a fixed value equal to zero, (ii) having a fixed value equal to $a_m^2$, and (iii) having a variable value between zero and $a_m^2$. These options have to be taken into account in order to find all critical points.

Finding Critical Points
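To begin with, we shall define sets of indices according to the aforementioned options:
$$Z = \{m : x_m = 0\}, \qquad O = \{m : x_m = a_m^2\}, \qquad I = \{m : 0 < x_m < a_m^2\}.$$
The symbols Z, O, and I stand for zero, outer, and inner, respectively. In this way, any summation can be written as $\sum_m = \sum_{j \in Z} + \sum_{k \in O} + \sum_{\ell \in I}$. There exist $3^D$ configurations for (Z, O, I). If we label each of those $3^D$ combinations by using the index µ, then we can define the function $Q_\mu(\vec{x})$ as the function $Q(\vec{x})$ for the µth configuration. Explicitly,
$$Q_\mu(\vec{x}) = P_{\mathrm{REF}} \left(\sum_{k \in O_\mu} a_k^2 + \sum_{\ell \in I_\mu} x_\ell\right)^2 - \sum_{k \in O_\mu} a_k^4 - \sum_{\ell \in I_\mu} x_\ell^2. \tag{19}$$
By imposing $\partial_{x_r} Q_\mu(\vec{x}) = 0$, we can find the critical points of $Q_\mu(\vec{x})$. Consequently,
$$x_r = P_{\mathrm{REF}} \left(\sum_{k \in O_\mu} a_k^2 + \sum_{\ell \in I_\mu} x_\ell\right), \qquad \forall\, r \in I_\mu. \tag{20}$$
This means that as long as $x_r$ is not fixed at either 0 or $a_r^2$, the optimal solution is such that those $x_r$ all adopt the same value. Let us define some additional ancillary parameters,
$$\beta_\mu = \sum_{k \in O_\mu} a_k^2, \qquad \gamma_\mu = \sum_{k \in O_\mu} a_k^4, \qquad n_\mu = |I_\mu|, \tag{21}$$
$n_\mu$ being the number of free parameters $x_\ell$. With these definitions, we can now assert that $x_\ell = \alpha_\mu$ is the critical point for the µth configuration, where
$$\alpha_\mu = \frac{P_{\mathrm{REF}}\, \beta_\mu}{1 - n_\mu P_{\mathrm{REF}}}. \tag{22}$$
Consequently, if $Q_\mu$ is the value of $Q_\mu(\vec{x})$ evaluated at the µth critical point, then
$$Q_\mu = \frac{P_{\mathrm{REF}}\, \beta_\mu^2}{1 - n_\mu P_{\mathrm{REF}}} - \gamma_\mu. \tag{23}$$
The fact that $x_\ell = \alpha_\mu$ means that, for every $\ell \in I_\mu$, the coefficients $a_\ell^2$ will be transformed into $\alpha_\mu$ as a consequence of the concentration procedure. This is, precisely, the Procrustean method applied on an $n_\mu$-dimensional subset of the coefficients $\{a_m\}$. It is worth mentioning that Equation (22) contains the implicit assumption $P_{\mathrm{REF}} \ne 1/n_\mu$, which raises questions regarding the case $P_{\mathrm{REF}} = 1/n_\mu$. If that were the case, trying to solve Equation (20) leads us to conclude $\beta_\mu = 0$ and, equivalently, $O_\mu = \emptyset$. In turn, this implies $Q_\mu(\vec{x}) = 0$. Nevertheless, we may see from the original definition of $Q(\vec{z})$ [Equation (14)] that the only possible way in which $Q_\mu(\vec{x}) = 0$ represents a maximum occurs when $C_{\mathrm{REF}}^2 = 1$ and $C^2(\vec{z}) = 1$ simultaneously, i.e., $P_{\mathrm{REF}} = 1/D$ has been set and the resulting state is a D-dimensional maximally entangled state.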

Upper Bounds for n µ
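The Hessian matrix has components given by
$$\frac{\partial^2 Q_\mu}{\partial x_r\, \partial x_s} = 2\left(P_{\mathrm{REF}} - \delta_{rs}\right), \qquad r, s \in I_\mu. \tag{24}$$
It can be shown that $Q_\mu$ will represent a local maximum for the µth configuration provided that $(1 - n_\mu P_{\mathrm{REF}}) > 0$, since this condition ensures that the Hessian matrix is negative-definite. In other words,
$$n_\mu < \frac{1}{P_{\mathrm{REF}}}. \tag{25}$$
Thus, some configurations $(Z_\mu, O_\mu, I_\mu)$ can be immediately discarded if $n_\mu$ exceeds this bound.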
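The negative-definiteness condition can be checked with a short eigenvalue argument. The following is a sketch, assuming that the Hessian restricted to the $n_\mu$ free variables has components $2(P_{\mathrm{REF}} - \delta_{rs})$, which is what one obtains by differentiating $P_{\mathrm{REF}} (\sum_m x_m)^2 - \sum_m x_m^2$ twice. The symbols $H$ and $J$ (the all-ones matrix) are introduced here only for illustration.

```latex
\begin{aligned}
H &= 2\left(P_{\mathrm{REF}}\, J - \mathbb{1}\right),
  \qquad J_{rs} = 1, \quad r, s \in I_\mu, \\
J \vec{u} &= n_\mu\, \vec{u} \quad \text{for } \vec{u} = (1, \dots, 1)^{T},
  \qquad J \vec{v} = 0 \quad \text{for every } \vec{v} \perp \vec{u}, \\
\operatorname{spec}(H) &= \bigl\{\, 2\,(n_\mu P_{\mathrm{REF}} - 1),\
  \underbrace{-2, \dots, -2}_{n_\mu - 1} \,\bigr\}.
\end{aligned}
```

All eigenvalues are negative if and only if $n_\mu P_{\mathrm{REF}} < 1$, which is the stated upper bound on $n_\mu$.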

Eliminating Zeros
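Let us start by analyzing the effect of zeros by comparing a given $Q_\mu$, for which $x_r = 0$, with the value of $Q_\mu(\vec{x})$ when $x_r = \delta \gtrsim 0$. Using Equation (19), we have that
$$Q_\mu(\vec{x})\big|_{x_r = \delta} = Q_\mu + 2 P_{\mathrm{REF}} \left(\beta_\mu + n_\mu \alpha_\mu\right) \delta - \left(1 - P_{\mathrm{REF}}\right) \delta^2, \tag{26}$$
which, in turn, leads us to
$$Q_\mu(\vec{x})\big|_{x_r = \delta} - Q_\mu \approx 2 P_{\mathrm{REF}} \left(\beta_\mu + n_\mu \alpha_\mu\right) \delta > 0. \tag{27}$$
We can see that $Q_\mu$ actually grows if $x_r$ moves away from zero within its neighborhood. This means that every configuration containing a null value on any of its $x_m$ cannot represent a maximum since all neighboring points have higher values for $Q(\vec{x})$. Therefore, the solution we are looking for is such that $Z_\mu = \emptyset$. The number of remaining configurations is now less than $2^D$.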

Optimal n Will Be the Largest Possible
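We are left with the options $x_m \in \{\alpha_\mu, a_m^2\}$. We know that the µth critical point is such that $x_\ell = \alpha_\mu$, $\forall\, \ell \in I_\mu$. Since $\vec{x}$ still belongs to the orthotope, an additional condition arises: $\alpha_\mu \le a_\ell^2$, $\forall\, \ell \in I_\mu$.
Let us now compare two solutions $Q_\lambda$ and $Q_\nu$, whose critical points differ only in one term $x_r$, so $r \in O_\lambda$ and $r \in I_\nu$. Thus, by using Equations (21)-(23), we have that
$$n_\nu = n_\lambda + 1, \qquad \beta_\nu = \beta_\lambda - a_r^2, \qquad \gamma_\nu = \gamma_\lambda - a_r^4.$$
Consequently,
$$Q_\nu - Q_\lambda = \frac{\left(1 - n_\lambda P_{\mathrm{REF}}\right) \left(a_r^2 - \alpha_\lambda\right)^2}{1 - n_\nu P_{\mathrm{REF}}} \ge 0.$$
Therefore, a better solution is obtained when r belongs to $I_\nu$ rather than $O_\lambda$, provided that the constraints are fulfilled. In simpler words, the best of the $\{n_\mu\}$ will be the largest possible within the conditions $n_\mu < 1/P_{\mathrm{REF}}$ and $\alpha_\mu \le a_\ell^2$, $\forall\, \ell \in I_\mu$.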

Sorting Preference
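For the following comparison, it will be helpful to define two sets $O_0$ and $I_0$. We will center our attention on two values $x_r$ and $x_s$. Now, let us compare two solutions $Q_\rho$ and $Q_\sigma$ that satisfy
$$I_\rho = I_0 \cup \{r\}, \quad O_\rho = O_0 \cup \{s\}, \quad I_\sigma = I_0 \cup \{s\}, \quad O_\sigma = O_0 \cup \{r\}. \tag{37}$$
Thus, $I_\rho$ and $I_\sigma$ have n − 1 elements in common, whereas $O_\rho$ and $O_\sigma$ have D − n − 1 elements in common. Consequently,
$$\beta_\rho = \beta_0 + a_s^2, \quad \gamma_\rho = \gamma_0 + a_s^4, \quad \beta_\sigma = \beta_0 + a_r^2, \quad \gamma_\sigma = \gamma_0 + a_r^4,$$
where $\beta_0 = \sum_{k \in O_0} a_k^2$ and $\gamma_0 = \sum_{k \in O_0} a_k^4$. For the following, we shall assume $a_r > a_s$. Now, since both $Q_\rho$ and $Q_\sigma$ are admissible solutions, it must happen that $\alpha_\rho \le a_r^2$ and $\alpha_\sigma \le a_s^2$ as a consequence of Equations (18), (22), and (37). This means $t(\beta_0 + a_s^2) \le a_r^2$ and $t(\beta_0 + a_r^2) \le a_s^2$, where $t = P_{\mathrm{REF}}/(1 - n P_{\mathrm{REF}})$ is a positive parameter. If we add these two inequalities, we obtain
$$t \left(2\beta_0 + a_r^2 + a_s^2\right) \le a_r^2 + a_s^2. \tag{42}$$
The difference between the solutions $Q_\rho$ and $Q_\sigma$ is
$$Q_\rho - Q_\sigma = \left(a_r^2 - a_s^2\right) \left[a_r^2 + a_s^2 - t \left(2\beta_0 + a_r^2 + a_s^2\right)\right]. \tag{43}$$
Since $a_r > a_s$ was assumed and the inequality of Equation (42) was obtained, it can be assured that $Q_\rho > Q_\sigma$. Now, let us remember that $Q_\rho$ is the solution in which $x_r = \alpha_\rho$ and $x_s = a_s^2$. This means it is better to cut off the coefficient $a_r$ (the larger one) rather than $a_s$. Since we already know (see Section 4.2.5) that n must be the largest possible within the constraints $n < 1/P_{\mathrm{REF}}$ and $\alpha_\mu \le a_\ell^2$, $\forall\, \ell \in I_\mu$, we must now compare all the solutions $Q_\mu$ such that $n_\mu$ is equal to that optimal value of n. According to the computations of this section, the most efficient concentration scheme will consist in cutting off the n largest Schmidt coefficients, which is in complete agreement with the results shown in Figure 2.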
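For small dimensions, the conclusions of the last subsections can be cross-checked by brute force. The sketch below enumerates every assignment of the indices to the sets Z, O, and I, evaluates the critical point of each admissible configuration, and keeps the best one. It assumes the critical value $\alpha_\mu = P_{\mathrm{REF}} \beta_\mu / (1 - n_\mu P_{\mathrm{REF}})$ and the bound $n_\mu < 1/P_{\mathrm{REF}}$; the function name, the tolerance, and the sample coefficients are hypothetical, and the cost grows as $3^D$, so this is only meant for very small D.

```python
from itertools import product

def best_configuration(a_sq, p_ref, tol=1e-12):
    """Enumerate all 3^D assignments to (Z, O, I); return (Q, labels, alpha) of the best."""
    best = None
    for labels in product("ZOI", repeat=len(a_sq)):
        n = labels.count("I")
        beta = sum(a for a, lab in zip(a_sq, labels) if lab == "O")
        gamma = sum(a * a for a, lab in zip(a_sq, labels) if lab == "O")
        alpha = 0.0
        if n > 0:
            if n * p_ref >= 1.0:
                continue  # critical point is not a local maximum (Hessian bound)
            alpha = p_ref * beta / (1.0 - n * p_ref)
            if any(alpha > a + tol for a, lab in zip(a_sq, labels) if lab == "I"):
                continue  # critical point lies outside the orthotope
        # Q at the critical point: P_REF * (sum x)^2 - sum x^2
        q = p_ref * (beta + n * alpha) ** 2 - gamma - n * alpha ** 2
        if best is None or q > best[0] + tol:
            best = (q, "".join(labels), alpha)
    return best

# Illustrative (made-up) coefficients, already sorted in decreasing order.
a_sq = [0.4, 0.3, 0.2, 0.1]
q, labels, alpha = best_configuration(a_sq, 0.3)
print(labels, alpha, q)
```

For this example the winning configuration places the two largest coefficients in I and the two smallest in O, with no zeros, in line with the rules derived above.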

How to Construct the Optimal Concentration Scheme
In summary, we know now that if $C_{\mathrm{REF}} = 1$ (equivalently, $P_{\mathrm{REF}} = 1/D$), then the optimal solution corresponds to an entanglement concentration procedure that yields a D-dimensional maximally entangled state. On the other hand, if $C_{\mathrm{REF}} < 1$ (equivalently, $P_{\mathrm{REF}} > 1/D$), we have shown that the optimal solution (i) does not contain zeros, (ii) has values either given by $x_m = a_m^2$ (i.e., keep $a_m$ as they are) or by $x_m = \alpha_\mu$ (i.e., crop coefficients $a_m$ to a given value $\alpha_\mu$), (iii) crops the n largest Schmidt coefficients, and (iv) uses an n as large as possible within the constraints given by $n < 1/P_{\mathrm{REF}}$ and $\alpha_\mu \le a_m^2$. Once the optimal $x_m$ are found, we may compute the corresponding $y_m$ and $z_m$. These rules gave rise to the algorithm described at the beginning of Section 4.2. Moreover, we performed thousands of numerical simulations, ranging from D = 32 to D = 1024, that confirmed that such an algorithm actually provides the optimal solution. Figure 3 shows a sample of those simulations for D = 1024, depicting relative differences between the results from numerical optimization ($\vec{y}_{\mathrm{num}}$ and $Q(\vec{y}_{\mathrm{num}})$) and the ones from the algorithm proposed in this section ($\vec{y}_{\mathrm{alg}}$ and $Q(\vec{y}_{\mathrm{alg}})$) for 100 values of $P_{\mathrm{REF}}$. The initial Schmidt coefficients were computed from a randomly generated D × D entangled state. As the data of Figure 3 show, relative differences between the two solutions being compared are negligible, thus demonstrating the adequacy of the proposed algorithm. Discrepancies can be explained as a consequence of floating-point computation precision.
After efficiency optimization, one should evaluate whether practical advantages were obtained from it. Figure 4 shows the probability of success and Schmidt number for the same optimizations carried out for Figure 3. The initial state had a Schmidt number $K_{\mathrm{INIT}} \approx 512$. Raising this number to its maximum (i.e., K = 1024) can be done with a probability of success $p_S = D a_{\min}^2 \sim 10^{-7}$ (not shown in the graphs in order to ease observation). However, non-maximal Schmidt numbers can be obtained with much better probabilities. For instance, $P_{\mathrm{REF}} \approx 1.15 \times 10^{-3}$ allows one to achieve a considerable Schmidt number (K = 900) with a success probability $p_S = 11\%$. Although $P_{\mathrm{REF}} \approx 1.15 \times 10^{-3}$ seems to be a non-trivial number of uncertain origin, we may notice that $1/P_{\mathrm{REF}} \sim 868$. Thus, an acceptable method to estimate the necessary value of $P_{\mathrm{REF}}$ consists in setting a minimum desirable Schmidt number $K_{\mathrm{MIN}}$, defining a slightly smaller threshold number $K_{\mathrm{THR}} < K_{\mathrm{MIN}}$, and computing $P_{\mathrm{REF}} = 1/K_{\mathrm{THR}}$.
It is worth mentioning that the solution described in this section closely resembles the entanglement concentration procedure described in Ref. [13], which was also graphically explained in Ref. [30]. However, we did not set the final state to a fixed one in our formulation. Instead, we defined a single figure of merit to be interpreted as efficiency, and its optimization suggested performing entanglement concentration on the subspace of the largest Schmidt coefficients.

Entanglement Concentration with Fixed Probability of Success
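An alternative way to solve the problem of efficient entanglement concentration is by setting the success probability to a fixed value $p_{\mathrm{FIX}}$ and inquiring about the largest entanglement that can be extracted. As can be seen from Equations (12) and (13), this question reduces to the minimization of the purity of the reduced density matrix, as
$$\min_{\vec{y}}\ P(\vec{y}) = \min_{\vec{y}}\ \frac{\sum_m a_m^4 y_m^2}{\left(\sum_m a_m^2 y_m\right)^2}, \tag{46}$$
where we have already used $y_m = |z_m|^2$. As we have imposed $p_S(\vec{y}) = p_{\mathrm{FIX}}$, the optimization reduces to minimizing $\sum_m a_m^4 y_m^2$. As in the previous section, we shall resort to $x_m = a_m^2 y_m$ and the sets of indices $Z_\mu$, $O_\mu$, and $I_\mu$. Using the $x_m$, we are left to minimize $\sum_m x_m^2$, and the constraint of fixed probability can be rewritten as $\sum_m x_m = p_{\mathrm{FIX}}$, which also allows us to write one of the variables in terms of the others. Let
$$x_\vartheta = p_{\mathrm{FIX}} - \beta_\mu - \sum_{\ell \in I_\mu,\ \ell \ne \vartheta} x_\ell, \qquad \vartheta \in I_\mu. \tag{47}$$
Then, the minimization of the purity can be rewritten as
$$\min_{\vec{x}}\ p_{\mathrm{FIX}}^2 P(\vec{x}) = \min_{\vec{x}} \left[\gamma_\mu + \sum_{\ell \in I_\mu,\ \ell \ne \vartheta} x_\ell^2 + \left(p_{\mathrm{FIX}} - \beta_\mu - \sum_{\ell \in I_\mu,\ \ell \ne \vartheta} x_\ell\right)^2\right]. \tag{48}$$
Critical points are found by setting $\partial (p_{\mathrm{FIX}}^2 P(\vec{x}))/\partial x_r = 0$, with $r \in I_\mu$ and $r \ne \vartheta$. This leads us to $x_r = \kappa_\mu$, where
$$\kappa_\mu = \frac{p_{\mathrm{FIX}} - \beta_\mu}{n_\mu}. \tag{49}$$
In turn, Equation (47) implies that $x_\vartheta = \kappa_\mu$ as well. Thus, we obtained solutions given by either $x_m = a_m^2$, $x_m = 0$, or $x_m = \kappa_\mu$, which is the exact behavior exhibited by the $x_m$ from Section 4 up to a change from $\alpha_\mu$ to $\kappa_\mu$. The same analysis performed in Sections 4.2.4-4.2.7 can be applied here. The conclusions are very similar: (i) the optimal values of $x_m$ are different from zero, (ii) if n is the number of variables $x_m$ being equal to $\kappa_\mu$, then n must be as large as possible within the constraint $0 \le \kappa_\mu \le a_\ell^2$, and (iii) the n largest Schmidt coefficients are cut off. Thus, an algorithm can be constructed as follows:
1. Sort the Schmidt coefficients in decreasing order. Let us label these sorted coefficients as $a_m$.
2. For n = 1, ..., D, compute $\beta_n = \sum_{m=n+1}^{D} a_m^2$.
3. For every n, compute $\kappa_n = (p_{\mathrm{FIX}} - \beta_n)/n$.
4. Find the largest value of n such that $\kappa_n \ge 0$ and $\kappa_n < a_n^2$ are simultaneously satisfied. Let us label this value as $n_{\mathrm{OPT}}$.
5. Define $x_m = \kappa_{n_{\mathrm{OPT}}}$ for $m \le n_{\mathrm{OPT}}$, and $x_m = a_m^2$ for $m > n_{\mathrm{OPT}}$.
6. Define $y_m = x_m / a_m^2$. Afterwards, sort the $y_m$ using the inverse of the sorting operation described in Step 1. These sorted values will be the $y_m$ that solve the fixed-probability optimization problem.
7. Define $z_m = \sqrt{y_m}$. These values are the ones needed to construct the Kraus operator of Equation (7).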
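The fixed-probability procedure can be sketched in the same spirit. The code below is an illustration under stated assumptions: the common cropping level is taken as $\kappa_n = (p_{\mathrm{FIX}} - \beta_n)/n$, with $\beta_n$ the sum of the D − n smallest squared coefficients, and the admissibility test uses non-strict inequalities with a small tolerance so that boundary cases such as $p_{\mathrm{FIX}} = 1$ are handled. The function name, the tolerance, and the sample values are hypothetical.

```python
import math

def concentrate_fixed_probability(a_sq, p_fix, tol=1e-12):
    """Return (y, z) minimizing the purity for a fixed success probability p_fix."""
    if not 0.0 < p_fix <= 1.0:
        raise ValueError("p_fix must lie in (0, 1]")
    D = len(a_sq)
    # Sort in decreasing order and remember the permutation.
    order = sorted(range(D), key=lambda m: a_sq[m], reverse=True)
    s = [a_sq[m] for m in order]
    n_opt, kappa_opt = 0, 0.0
    beta = sum(s)
    for n in range(1, D + 1):
        beta -= s[n - 1]                       # beta_n: sum of the D - n smallest values
        kappa = (p_fix - beta) / n
        if -tol <= kappa <= s[n - 1] + tol:    # admissible cropping level
            n_opt, kappa_opt = n, max(kappa, 0.0)
    # Crop the n_opt largest coefficients to kappa and keep the others.
    x_sorted = [kappa_opt] * n_opt + s[n_opt:]
    y = [0.0] * D
    for pos, m in enumerate(order):
        y[m] = min(1.0, x_sorted[pos] / a_sq[m])
    return y, [math.sqrt(v) for v in y]

# Illustrative (made-up) coefficients for D = 4 and a target probability of 60%.
a_sq = [0.4, 0.3, 0.2, 0.1]
y, z = concentrate_fixed_probability(a_sq, 0.6)
x = [a * v for a, v in zip(a_sq, y)]
print("x =", x, " p_S =", sum(x), " K =", sum(x) ** 2 / sum(v * v for v in x))
```

The resulting profile crops the largest coefficients to a common level and leaves the smaller ones untouched, so the only input besides the state is the desired success probability.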
As can be seen, the solutions obtained for this problem are completely analogous to the ones of the previous section. The advantage of this approach lies in the fact that $P(\vec{x})$ appears in both the I-Concurrence and the Schmidt number. Thus, it is a favorable way to increase the Schmidt number without introducing nontrivial mathematical complications. Once more, this result represents a Procrustean method applied on a subspace, although only one parameter has been fixed ($p_{\mathrm{FIX}}$) instead of a whole state.

Conclusions
In summary, we have studied entanglement concentration from a single copy of a two-qudit entangled state in terms of efficiency. As the ideal procedure (obtaining a maximally entangled state) is extremely inefficient in terms of probability, we studied the possibility of concentrating a fair amount of entanglement and, simultaneously, increasing the success probability. Two methods were analyzed. For the first one, a function $Q(\vec{y})$ was defined in order to quantify efficiency as the product of success probability and entanglement increment. This function allows one to introduce a parameter $P_{\mathrm{REF}}$, which is loosely related to the minimal amount of entanglement one intends to extract. The other one consisted of fixing the success probability to a given value and finding the maximal entanglement that can be extracted under this constraint. We found that, for both cases, the solution resembles a Procrustean method applied on a subset of the largest Schmidt coefficients. Such an application of the Procrustean method has already been studied in the literature under the assumption that the final state must be an n-dimensional maximally entangled state, with n < D. Therefore, n constraints are implicitly assumed. Instead, this work does not impose constraints on the final state. In the first method, the Procrustean method results as a consequence of a quadratic optimization problem. In the second one, it emerges after optimizing entanglement and using a single constraint.
We anticipate that this work may be useful for understanding how to concentrate entanglement efficiently in very large dimensions. As entanglement is a resource underlying many protocols in Quantum Information Science, we believe many people in the Quantum Information community may benefit from these findings.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Why Is It Necessary to Add a Difference?
In Section 3, we asserted that $[p(\vec{z}) C(\vec{z})]^2$ has its maximum when $z_m = 1$, ∀ m. This means keeping the original state unaltered, without making any attempt to concentrate entanglement. In order to prove it, let us remember Equations (10) and (12). We may observe that
$$p^2(\vec{1})\, C^2(\vec{1}) - p^2(\vec{z})\, C^2(\vec{z}) = \frac{D}{D-1} \sum_{m,n} \left(1 - \delta_{mn}\right) \left(1 - |z_m|^2 |z_n|^2\right) a_m^2 a_n^2 \ge 0,$$
because $|z_m| \le 1$. Thus, straight optimization of $p^2(\vec{z}) C^2(\vec{z})$ will suggest doing nothing and, instead, keeping the entanglement as it is. For this reason, it is necessary to add a reference level for entanglement. In other words, it is better to optimize $p^2(\vec{z}) \left[C^2(\vec{z}) - C_{\mathrm{REF}}^2\right]$ rather than maximizing solely $p^2(\vec{z}) C^2(\vec{z})$ in order to actually increment entanglement.
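The statement of this appendix can also be checked numerically on a coarse grid. The sketch below evaluates $p^2 C^2$ as a function of $y_m = |z_m|^2$ and confirms, for one made-up set of coefficients, that no grid point beats the choice $y_m = 1$. It assumes the I-Concurrence normalization D/(D−1); the function name and sample values are hypothetical.

```python
def squared_product(a_sq, y):
    """p_S^2 * C^2 as a function of y_m = |z_m|^2 (normalization D/(D-1) assumed)."""
    D = len(a_sq)
    p = sum(a * v for a, v in zip(a_sq, y))
    quartic = sum((a * v) ** 2 for a, v in zip(a_sq, y))
    return D / (D - 1) * (p ** 2 - quartic)

# Illustrative (made-up) coefficients for D = 3 and a grid over the unit cube.
a_sq = [0.6, 0.3, 0.1]
grid = [i / 20 for i in range(21)]
reference = squared_product(a_sq, [1.0, 1.0, 1.0])
largest = max(
    squared_product(a_sq, [y0, y1, y2])
    for y0 in grid for y1 in grid for y2 in grid
)
print(reference, largest)
```

Both printed values coincide, which illustrates why a reference level has to be subtracted before the optimization favors any actual filtering.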

Appendix B. Why Does a Diagonal Kraus Operator Suffice?
In Equation (7), we assumed $A_S(\vec{z})$ to be diagonal in the $\{|m\rangle\}$ basis. This section will show why nondiagonal terms do not increase efficiency. Let us redefine $A_S$ to be a general operator with components $\zeta_{mn}$. We will add an additional definition. Let $\Pi(\zeta) = A_S^\dagger A_S$ be a positive operator whose matrix components are $\pi_{mn} = \sum_j \zeta_{jm}^* \zeta_{jn}$ and satisfy $\pi_{mn}^* = \pi_{nm}$ and $\pi_{jj} \ge 0$. If Π is known, then $A_S = U\sqrt{\Pi}$, where U is an arbitrary unitary operator whose explicit form depends on experimental details about the physical implementation of $A_S$. Now, considering that $A_S = U\sqrt{\Pi}$, Equations (9) and (10) must be modified accordingly, and the efficiency becomes
$$Q(\zeta) = \frac{D}{D-1} \left[P_{\mathrm{REF}} \left(\sum_m \pi_{mm} a_m^2\right)^2 - \sum_{m,n} |\pi_{mn}|^2 a_m^2 a_n^2\right]. \tag{A1}$$
It can be seen that $Q(\zeta)$ does not depend on U. In addition, the only positive term on the RHS of Equation (A1) depends on the diagonal components $\pi_{mm}$, whereas nondiagonal components only diminish the efficiency. Consequently, the optimal operator Π must be diagonal. This last condition can be satisfied, although not uniquely, by imposing $A_S$ to be diagonal, so Equation (7) suffices to find the adequate operation to optimize the function Q.